Best Rainfall Prediction Model Based on Skill Score

This article describes techniques for determining the best rainfall prediction model using the skill score method.

Eggy Pandiangan
2024-10-10

1. INTRODUCTION

Definition and Goals of Climate Seasonal Forecast Assessment

A Climate Seasonal Forecast Assessment refers to the systematic evaluation of seasonal climate forecasts, which typically predict weather patterns over several months. This process involves analyzing the accuracy, skill, and reliability of these forecasts to determine how closely they match the actual weather outcomes (observations). The primary goal of this assessment is to verify whether the forecasted seasonal conditions, such as temperature, precipitation, and wind patterns, align with the observed climate data for the given period.

The verification aspect plays a crucial role in this assessment, as it measures the forecast’s skill and consistency. Verification involves comparing the predicted values with real-world data, often using statistical techniques to determine the forecast’s accuracy. The results of the verification help meteorologists refine forecasting models, identify areas for improvement, and better understand the underlying climatic variables that influence seasonal patterns. Validation is the confirmation through testing and providing objective evidence that certain requirements for a specific purpose are met by a model. Verification, on the other hand, is the process of comparing model calculations (forecasts) with actual values (observations), which is generally equated with validation (Jolliffe and Stephenson 2011).

The type of verification is adapted to the type of forecast; here are some examples of forecast types and the corresponding verification methods (WWRP 2017):

| Type of forecast | Example(s) | Verification methods |
|---|---|---|
| **Nature of forecast** | | |
| deterministic (non-probabilistic) | quantitative precipitation forecast | visual, dichotomous, multi-category, continuous, spatial |
| probabilistic | probability of precipitation, ensemble forecast | visual, probabilistic, ensemble |
| qualitative (worded) | 5-day outlook | visual, dichotomous, multi-category |
| **Space-time domain** | | |
| time series | daily maximum temperature forecasts for a city | visual, dichotomous, multi-category, continuous, probabilistic |
| spatial distribution | map of geopotential height, rainfall chart | visual, dichotomous, multi-category, continuous, probabilistic, spatial, ensemble |
| pooled space and time | monthly average global temperature anomaly | dichotomous, multi-category, continuous, probabilistic, ensemble |
| **Specificity of forecast** | | |
| dichotomous (yes/no) | occurrence of fog | visual, dichotomous, probabilistic, spatial, ensemble |
| multi-category | cold, normal, or warm conditions | visual, multi-category, probabilistic, spatial, ensemble |
| continuous | maximum temperature | visual, continuous, probabilistic, spatial, ensemble |
| object- or event-oriented | tropical cyclone motion and intensity | visual, dichotomous, multi-category, continuous, probabilistic, spatial |

Allan Murphy (Murphy 1993), a pioneer in the field of forecast verification, distinguishes three types of "goodness" of a prediction: consistency (correspondence between the forecast and the forecaster's judgment), quality (correspondence between the forecast and the observation), and value (the benefit the forecast provides to its users).

In addition, Murphy identified eight aspects (attributes) that characterize the quality of a forecast:

| No. | Aspect/Attribute | Description | Scores/metrics that can be used |
|---|---|---|---|
| 1 | Bias | Difference between the predicted and observed means | Mean Error (ME), Mean Square Error (MSE), scatter plot |
| 2 | Accuracy | Degree of agreement between prediction and observation | Continuous Ranked Probability Score (CRPS), Taylor diagram |
| 3 | Uncertainty | Variability of the observed values; the greater the uncertainty of the observations, the harder they are to predict | Verification Rank Histogram (VRH) |
| 4 | Sharpness | Ability of the forecast to predict extreme values; sharpness is possessed only by predictions (not observations), and even poor predictions still have it | Ensemble Spread (SPRD), Spread-Skill Relationship (SSR) |
| 5 | Resolution | Ability of a forecast to differ from the climatological probability of an event, and whether the prediction system can predict it correctly; typically measured via the mean square of probabilistic prediction error | Brier Score (BS), Relative Operating Characteristic (ROC) |
| 6 | Discrimination | Ability of the forecast to clearly distinguish situations that lead to the occurrence or non-occurrence of an event | Relative Operating Characteristic (ROC) |
| 7 | Reliability | Statistical consistency between the probabilistic prediction of an event and its actual frequency of occurrence | Reliability Diagram (RD), Brier Score (BS) |
| 8 | Skill | Relative accuracy of a prediction model against a reference (e.g., climatology), i.e., the improvement in accuracy due to the prediction system; skill measures the superiority of a prediction system over a baseline of past observations | Skill Scores (SS): BSS, CRPSS, ROCSS |
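Two of the probabilistic scores named above can be illustrated with a short sketch. The functions below are generic textbook formulas, not code from this article: the Brier Score compares forecast probabilities `p` with binary outcomes `o` (1 = event occurred, 0 = did not), and the Brier Skill Score (the BSS listed under Skill) measures improvement over the climatological base rate.

```r
# Brier Score: mean squared difference between forecast probability
# and the binary observed outcome. Best score = 0.
brier_score <- function(p, o) mean((p - o)^2)

# Brier Skill Score: skill relative to always forecasting the
# climatological frequency mean(o). Best score = 1; 0 means no skill.
brier_skill_score <- function(p, o) {
  1 - brier_score(p, o) / brier_score(mean(o), o)
}
```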

Examples of Evaluations for Climate Seasonal Forecasts

Some meteorological agencies in other countries have evaluated their predictions using several verification metrics, for example:


2. HOW IS IT EVALUATED?

Approaches for Assessing Climate Seasonal Forecast

There are several approaches that are commonly used to assess seasonal climate forecasts (Wilks 2011; Jolliffe and Stephenson 2011; WWRP 2017), such as:

| No. | Metric | Formulation | Description |
|---|---|---|---|
| 1 | Mean Error (ME) | \(\text{ME}=\frac{\sum_{i=1}^{n}\left(f_i-o_i\right)}{n}\) | \(n=\) number of data pairs (forecast & observation); \(f_i=\) forecast value; \(o_i=\) observation value; best score \(=0\); verification method = continuous |
| 2 | Mean Absolute Error (MAE) | \(\text{MAE}=\frac{\sum_{i=1}^{n}\lvert f_i-o_i\rvert}{n}\) | \(n=\) number of data pairs (forecast & observation); \(f_i=\) forecast value; \(o_i=\) observation value; best score \(=0\); verification method = continuous |
| 3 | Root Mean Square Error (RMSE) | \(\text{RMSE}=\sqrt{\frac{\sum_{i=1}^{n}(f_i-o_i)^2}{n}}\) | \(n=\) number of data pairs (forecast & observation); \(f_i=\) forecast value; \(o_i=\) observation value; best score \(=0\); verification method = continuous |
| 4 | Boxplot | (figure: boxplot) | position of \(Q_1=(n+1)\times 0.25\); position of \(Q_2=(n+1)\times 0.5\); position of \(Q_3=(n+1)\times 0.75\); \(IQR=Q_3-Q_1\); best score = closest to observation; verification method = continuous, visual |
| 5 | Scatter plot | (figure: scatter plot) | best score = points gather around the diagonal; verification method = continuous, visual |
| 6 | Correlation Coefficient \((r)\) | \(r=\frac{\sum{(f_i-\overline{f})(o_i-\overline{o})}}{\sqrt{\sum{(f_i-\overline{f})^2}}\sqrt{\sum{(o_i-\overline{o})^2}}}\) | \(f_i=\) i-th forecast value; \(\overline{f}=\) mean of forecast values; \(o_i=\) i-th observation value; \(\overline{o}=\) mean of observation values; best score \(=1\); verification method = continuous |
| 7 | Dichotomous Contingency Table | (figure: 2×2 contingency table) | \(\text{HR} = \frac{a}{a + c} = \hat{p}(\hat{x} = 1 \mid x = 1)\); \(\text{PC or Accuracy} = \frac{a + d}{n} = \hat{p}\left[(\hat{x} = 1, x = 1) \, \text{or} \, (\hat{x} = 0, x = 0)\right]\); \(\text{FAR} = \frac{b}{a + b} = \hat{p}(x = 0 \mid \hat{x} = 1)\); \(\text{CSI} = \frac{a}{a + b + c}\); ROC; best scores: HR \(=1\), PC \(=1\), FAR \(=0\), CSI \(=1\); verification method = dichotomous |
| 8 | Multi-Category Contingency Table | (figure: K×K contingency table) | \(\text{Accuracy} = \frac{1}{N} \sum_{i=1}^{K}n(F_i, O_i)\); \(\text{HSS} = \frac{\frac{1}{N} \sum_{i=1}^{K} n(F_i, O_i)-\frac{1}{N^2} \sum_{i=1}^{K} N(F_i) N(O_i)}{1 - \frac{1}{N^2} \sum_{i=1}^{K} N(F_i) N(O_i)}\); best scores: Accuracy \(=1\), HSS \(=1\); verification method = multi-category |
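Several of the scores above are one-liners in R. The following sketch is illustrative, not part of the original workflow (package equivalents exist, e.g. `Metrics::mae` and `Metrics::rmse`); the dichotomous scores use the usual cell naming a = hits, b = false alarms, c = misses, d = correct negatives.

```r
# Continuous scores: forecast vector f against observation vector o
me   <- function(f, o) mean(f - o)            # Mean Error (bias), best = 0
mae  <- function(f, o) mean(abs(f - o))       # Mean Absolute Error, best = 0
rmse <- function(f, o) sqrt(mean((f - o)^2))  # Root Mean Square Error, best = 0

# Dichotomous contingency-table scores from the four cell counts
hr  <- function(a, b, c, d) a / (a + c)               # Hit Rate, best = 1
far <- function(a, b, c, d) b / (a + b)               # False Alarm Ratio, best = 0
csi <- function(a, b, c, d) a / (a + b + c)           # Critical Success Index, best = 1
pc  <- function(a, b, c, d) (a + d) / (a + b + c + d) # Percent Correct, best = 1
```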

Skill Score Method

BMKG's prediction of the onset of the Indonesian season, for both the wet and dry seasons, is based on rainfall values obtained from the output of several models. These models comprise dynamical and statistical models, each with its own strengths and weaknesses. Using multiple models inevitably introduces more uncertainty into the prediction of season onset. Several things can be done to overcome this:

Some of the metrics described above can be used to evaluate the rainfall output of the models used in seasonal prediction. Here we use an evaluation metric called the Taylor Skill Score (TSS) (Yang et al. 2020), commonly called the Skill Score (SS), which is based on the Taylor diagram. The Taylor diagram displays three statistics (the correlation coefficient, RMSE, and \({NSD}_m\)) in one graph; judging each statistic separately makes it difficult to determine the best model. The TSS combines the three into a single value expressing the skill of the rainfall prediction model being evaluated.

\[\begin{equation} \tag{1} SS = \frac{4(1 + CC)^4}{\left( \text{NSD}_m + \frac{1}{\text{NSD}_m} \right)^2 (1 + CC_0)^4} \end{equation}\] \[\begin{equation} \tag{2} \text{NSD}_m = \frac{\sigma_{\text{mod}}}{\sigma_{\text{obs}}} \end{equation}\]

Where:

- \(\text{CC}\) = correlation coefficient between the simulated (model) and observed data
- \(\text{CC}_0\) = maximum attainable correlation coefficient
- \(\text{NSD}_m\) = normalized standard deviation of the simulation (model)
- \(\sigma_{\text{mod}}\) = standard deviation of the simulation (model)
- \(\sigma_{\text{obs}}\) = standard deviation of the observation
- \(m\) = index denoting the model (simulation)

The closer SS is to 1, the better the individual model represents the observations.
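Equations (1) and (2) translate directly into a few lines of R. The function below is a minimal sketch, assuming \(CC_0 = 1\) (the maximum attainable correlation) as the default:

```r
# Taylor Skill Score from paired forecast and observation vectors.
# cc0 is CC_0 in Eq. (1); 1 is assumed here.
taylor_skill_score <- function(fcst, obs, cc0 = 1) {
  cc   <- cor(fcst, obs)      # CC: Pearson correlation, Eq. (1)
  nsdm <- sd(fcst) / sd(obs)  # NSD_m = sigma_mod / sigma_obs, Eq. (2)
  4 * (1 + cc)^4 / ((nsdm + 1 / nsdm)^2 * (1 + cc0)^4)
}
```

A perfect forecast (identical to the observations) has \(CC = 1\) and \(\text{NSD}_m = 1\), giving \(SS = 4 \cdot 2^4 / (2^2 \cdot 2^4) = 1\).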


3. RAINFALL PREDICTION EVALUATION (FOR SEASON ONSET DETERMINATION)

Assessment at ZOM 9120

Here is an example of rainfall forecast data from several models (initialized January 2011) and observations, covering 21 dekads (10-day periods) for ZOM Aceh_01:

CODE TO CREATE A TAYLOR DIAGRAM

library(openair)  # TaylorDiagram()
library(Metrics)
library(tidyr)

# dt: one row per model (plus OBS), one column per dekad
# dt <- read.csv('path/to/your/data.csv')

# long format: one row per MODEL x dekad (DAS)
dt1 <- pivot_longer(data = dt, cols = colnames(dt)[!colnames(dt) %in% c('MODEL')],
                    names_to = 'DAS', values_to = 'CH')
dt1$MODEL <- factor(dt1$MODEL, levels = unique(dt1$MODEL))
dt1$DAS   <- factor(dt1$DAS,   levels = unique(dt1$DAS))

# transpose so each model becomes a column and each dekad a row
t_dt <- as.data.frame(t(dt))
nms  <- t_dt[1, ]
t_dt <- t_dt[2:nrow(t_dt), ]
colnames(t_dt) <- nms
t_dt$DAS <- rownames(t_dt)
t_dt[!colnames(t_dt) %in% c('DAS')] <- as.data.frame(sapply(t_dt[!colnames(t_dt) %in% c('DAS')], as.numeric))
t_dt2 <- t_dt  # keep a wide copy for later use

# long format with OBS kept as its own column, as TaylorDiagram() expects
t_dt1 <- as.data.frame(pivot_longer(data = t_dt,
                                    cols = colnames(t_dt)[!colnames(t_dt) %in% c('DAS', 'OBS')],
                                    names_to = 'MODEL', values_to = 'CH'))
t_dt1[!colnames(t_dt1) %in% c('MODEL', 'DAS')] <- sapply(t_dt1[!colnames(t_dt1) %in% c('MODEL', 'DAS')], as.numeric)
t_dt1$MODEL <- factor(t_dt1$MODEL, levels = unique(t_dt1$MODEL))

p2 <- TaylorDiagram(t_dt1, obs = "OBS", mod = "CH", group = "MODEL", normalise = TRUE,
                    cols = c('cornflowerblue', 'blue', 'yellow2', 'deeppink',
                             'deeppink4', 'orange1', 'orange4'),
                    text.obs = 'Observation', key = TRUE, main = 'Taylor Diagram')

| Metric | ECMWF_RAW | ECMWF_COR | MME1 | CFSv2_RAW | CFSv2_COR | ARIMA | WARIMA |
|---|---|---|---|---|---|---|---|
| RMSE | 42.9393894 | 32.8570424 | 38.5247122 | 46.4539831 | 31.4249754 | 43.5126742 | 42.0122118 |
| CC | 0.6972082 | 0.7312276 | 0.5420013 | 0.4109492 | 0.7422360 | 0.3165167 | 0.4026214 |
| NSDm | 0.6112191 | 0.5841298 | 0.6217545 | 0.6297243 | 0.7395452 | 0.3083050 | 0.3535605 |
| SS | 0.4107342 | 0.4259743 | 0.2842024 | 0.2014521 | 0.5264507 | 0.0595302 | 0.0955697 |
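As a sanity check, the SS row can be reproduced from the printed CC and NSDm values by applying Equation (1) directly, assuming \(CC_0 = 1\):

```r
# CC and NSDm values as printed above, one entry per model
cc  <- c(ECMWF_RAW = 0.6972082, ECMWF_COR = 0.7312276, MME1      = 0.5420013,
         CFSv2_RAW = 0.4109492, CFSv2_COR = 0.7422360, ARIMA     = 0.3165167,
         WARIMA    = 0.4026214)
nsd <- c(0.6112191, 0.5841298, 0.6217545, 0.6297243, 0.7395452, 0.3083050, 0.3535605)

# Eq. (1) with CC0 = 1
ss <- 4 * (1 + cc)^4 / ((nsd + 1 / nsd)^2 * (1 + 1)^4)
round(ss, 7)          # reproduces the SS row above
names(which.max(ss))  # "CFSv2_COR": the highest-skill model at this ZOM
```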


Using the Skill Score Method to Interpret the Evaluation

Boxplot of the SS values.


4. RAINFALL WEIGHTING

Seasonal Rainfall Prediction Weighting based on Skill Score


Seasonal Rainfall Prediction ZOM 9120 using Skill Score-based weighting


5. SIMULATION OF SKILL SCORE METHOD



References

Jolliffe, I. T., and D. B. Stephenson. 2011. Forecast Verification: A Practitioner's Guide in Atmospheric Science. Wiley. https://books.google.co.id/books?id=sgwIEAAAQBAJ.
Murphy, Allan H. 1993. "What Is a Good Forecast? An Essay on the Nature of Goodness in Weather Forecasting." Weather and Forecasting 8 (2): 281–93. https://doi.org/10.1175/1520-0434(1993)008<0281:WIAGFA>2.0.CO;2.
Wilks, Daniel S. 2011. Statistical Methods in the Atmospheric Sciences. Amsterdam; Boston: Elsevier Academic Press.
WWRP. 2017. "WWRP/WGNE Joint Working Group on Forecast Verification Research." https://www.cawcr.gov.au/projects/verification.
Yang, Xiaoli, Yuqian Wang, Xiaohan Yu, and Justin Sheffield. 2020. "The Optimal Multimodel Ensemble of Bias-Corrected CMIP5 Climate Models over China." Journal of Hydrometeorology 21 (4): 845–63. https://doi.org/10.1175/JHM-D-19-0141.1.